home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Libris Britannia 4
/
science library(b).zip
/
science library(b)
/
MATH
/
POLYSIM.ZIP
/
POLYSIM.DOC
< prev
next >
Wrap
Text File
|
1989-05-22
|
10KB
|
190 lines
POLYSIM - Fits experimental data to a large polynomial equation using
a modified simplex approach.
Version 1.1
D.J.Murphy 1989
Usage: polysim [min_index max_index] inputfile outputfile
Description: POLYSIM is designed to be run either on a PC in a "fire & forget"
manner, or on a Unix (or other) host capable of supporting
background processes. This is because optimization can be a time
consuming business, and so once invoked no other input is required
providing the data has been entered into the input file correctly.
It fits data to the equation:
y = ax^5 + bx^4 + cx^3 + dx^2 + ex + f + g/x + h/(x^2)
using a modified simplex method (see below) to optimize the
coefficients a through h such that the mean deviation of Y(input)
from Y(calculated) is minimized.
Alternatively, any part of this equation can be fitted by entering
minimum and maximum values for the X-index on the command line.
Restrictions here are that the indices must either both be entered
or not at all (not one alone), and must be entered minimum then
maximum. The indices, I, must be integers and 5 >= I >= -2.
Using this option it is possible to use an educated guess at the
general form of the function your data should fit to restrict the
amount of work POLYSIM has to do, and thus reduce the time it will
take to run.
Output is via a text file, and includes number of iterations used,
values of all the coefficients, mean deviation and a list of
input data, Y(calculated) and the differences.
Index Range: If you do not include this, the default range of X^5 to X^(-2)
will be used. If you want to restrict this, enter the minimum
index value you want, then the maximum, as integers.
Input file: The input file must contain the input data as delimited
X Y pairs, preferably one pair per line:
eg. 1 1
2 4
3 9
4 16
| |
Do not include anything else in this dataset. Try to avoid
including x=0 values, as data pairs with x = 0 are ignored on
reading the file (this to avoid divide-by-zero errors when
evaluating Y(calculated)). Any valid filename is accepted.
Output file: Any valid filename is accepted. Any existing file of that name
will be overwritten, unless it is write-protected in which case
POLYSIM will fail AFTER running the optimization.
Example: C:\polysim 0 2 expt1.dat expt1.res
Run POLYSIM on the data in C:\EXPT1.DAT, optimizing to the
function y = ax^2 + bx + c, and write the results to the
file C:\EXPT1.RES which can be listed to the screen, printer
or included in other programs.
Cautions: This is an optimization program - it is wise to plot the data
visually before using it, as the fit POLYSIM reports may be better
for some parts of the dataset than others. It is unwise to
extrapolate too much beyond the range of the given data set.
POLYSIM also takes a while to run - you are advised to try using
a log(X) - log(Y) least squares method first to try and get a
straight line, unless you either know that this won't give you
a straight line or you are desperate for a curve.
Also, not all data sets will give reasonable curves - again, try
plotting first to see if this is likely to be so - if not the
answer (if you get one) is likely to be meaningless.
Portability: POLYSIM is written in C, and was developed on a PC using
Borland's Turbo C v1.5, and compiled using floating point
emulation so it will use an 80x87 math coprocessor if one is
present but it does not require it (use of such is highly
recommended, though - POLYSIM makes extensive use of
double precision real numbers and as such will be very slow
without a floating point processor).
No odd facilities are used in the source code, so it should
be portable onto a Unix system, or to any ANSI C compiler -
it _should_ be - I'm making no promises.
What is "modified simplex" optimization?
Simplex is a formal procedure for optimizing a set of parameters in a case
where you can measure responses for different combinations of parameter values
but have no knowledge of the "response surface" which gives rise to the
observed pattern - i.e. it is not possible to use calculus to find a maximum
or minimum. A simplex is a geometrical figure which has one more vertex than
the number of dimensions it occupies - a 2 dimensional simplex is a triangle,
for example. POLYSIM uses up to an 8 dimensional simplex, one for each
coefficient, so needs up to 9 equations. Consider the 2d one for illustration.
We have some response, z, such that z = f(x,y), but we don't know what. In the
basic simplex method, we test 3 points, B, N and W, and get the Best, Next best
and Worst responses respectively. We then take a line between B and N and find
the centre of it, C - then reflect W through that centre point to get a new
point, A :
The figure BNW is the old simplex, and BNA is the
new one. If we now test the response at A, and compare
B it with those at B and N, we can repeat the process.
Doing this often enough will home the simplex in on
the optimum response (think about it).
W C A
N
However, the problem with this is 3 fold. Firstly, the simplex stays the same
size, so there is a limit to how accurately we can home in on the optimum. Also
because of this, large "distances" on the x,y surface are covered fairly slowly
if we make the figure small enough to get a reasonable resolution.
The third problem is that, other than looking for repetition of the position
of the simplex, there is little opportunity for detecting when the optimum has
been reached and thus when to stop the process.
The answer to these problems comes with "modified" or "accelerated" simplex,
which is as above with additions:
B Now we reflect to find A, as before, but
this time we check the response at A first
and modify the simplex:
W F C G A H If response at A is BETTER than B, we're
obviously going in the right general
direction, so double the W translation and
N make the new simplex BNH instead.
If response at A is between B and N, stay at A, but if it is between W and N,
we must be careful not to overshoot, so the new simplex is BNG. If response at
A is worse than W, the new simplex is BNF.
In practice, this means that the simplex gets bigger to cover large "distances"
toward the optimum, then shrinks itself around it to home in with a higher
resolution. This gives a method of stopping the process as we can elect to
terminate the simplex loop when it has got below a certain size (for example,
0.1% of the total Y-range - the sort of thing POLYSIM does).
POLYSIM uses this this method, but with up to 8 dimensions, to optimize the
combination of coefficients such that |(y(calculated) - y(given))| is minimized.
The point at which it stops, and hence the resolution, is set within the
program, but can be changed with recompilation.
Acknowledgements:
What I know about simplex optimization comes from writing a teaching program
in 1988 for the University of Edinburgh Chemistry Department. POLYSIM is an
application based upon experience gained in that project, which was carried
out for Drs. Kenneth Lawley and Ian Gosney.
The paper I was referred to was:
S.N. Deming & S.L. Morgan, Analytical Chemistry vol.45, 278A (1973),
and further references can be found at the end of this, though I have not
looked at them.
Disclaimer and such like:
This was written by me for me. I'm not going to accept any hassle over it
since I have not ripped anyone off and am distributing it because it has
come to my attention that some people might find it useful.
I've included source code as well as IBM PC executable in the archive
supplied, for those who want to play with it (and make improvements which
are possible but which escape me at present) or port it to other machines.
I really don't care if anyone changes anything, but unless you're going to
improve it there is not much point. If you do improve it, could you please
mail them to me at one of the addresses below with a description of the
modification. Anything implemented will be forwarded with acknowledgements.
The same goes for constructive criticism. Flame will go to /dev/null.
Contact:
Anyone wishing to contact me about this program can do so at the following
addresses:
SnailMail: D.J. Murphy
Room 213
Chemistry Department
Edinburgh University
King's Buildings
West Mains Road
Edinburgh
Scotland
EH9 3JJ
Email:
JANET ARPA
djm@uk.ac.edinburgh.etive djm%ed.etive@nsfnet-relay.ac.uk
or or
Murff@uk.ac.edinburgh Murff%ed.emas-a@nsfnet-relay.ac.uk